ABSTRACT


This study was conducted to determine quantitative measures
of reliability for operational software in embedded avionics
computer systems. Analysis was carried out on data collected during
flight testing and from both static and dynamic simulation testing.
Failure rate was found to be a useful statistic for estimating
software quality and recognizing reliability trends during the
operational phase of software development. The scope of the
analysis was limited by insufficient data; ideally, data would come
from an environment where adequate maintenance and service records
for avionics systems are kept.

I. INTRODUCTION


This is the final report of the third phase of the 
Measurement of Software Reliability Study conducted for the 
NASA Langley Research Center, under contract NAS1-14392.

The purpose of this study was to develop quantitative 
measures of reliability for operational avionics software 
systems. Previous studies (References 1 and 2) analyzed
measures of software reliability using data acquired during the
development phase of a data base software product, which was
designed to operate in a typical batch environment. Failure 
rate and failure ratio were found to be statistically valid 
measures for predicting software reliability during the 
development phase. 

During this study, an attempt was made to further examine 
the statistical attributes of these two measures as quantities 
for estimating and predicting software reliability during the 
operational phase of a software product. However, availability 
of adequate data to conduct reliability measurement analysis 
has been a very limiting element in this study. In order to 
support this study, data reporting was required on error 
frequency, cause of error, and time required to isolate and 
correct errors and recertify the software. Unfortunately the 
data sources contacted during the study were software 
development and maintenance groups that retain software error 
data primarily for diagnostic purposes. 

Much of the burden of collecting and assembling the data 
rested upon the development groups because of the distributed 
nature of the data. Every attempt was made to select data that 
were collected during the final test and verification stages of 
the development cycle; thus, the code maturity would be near 
that of an initial operational release. The trouble report 
data were correlated with CPU time data to establish analytic
data sets. These data sets consist of composed data, collected
during static simulation testing, dynamic simulation testing, 
and actual flight testing. 

II. OVERVIEW


Prior to detailed discussion of the operational software 
measures studied, it is important to consider what is meant by 
software reliability. To a great extent this is dependent upon 
the software application intended. 

To a group responsible for the design of advanced computer 
systems, the issue of reliability is of central concern. The 
ability of the software to operate correctly as the system 
environment changes, as in fault-tolerant technology,
is a major element affecting the overall system performance. 

To a group responsible for design, implementation, and 
validation of software products, the issue of reliability is 
commonly replaced by the issue of quality. Any large or 
sophisticated operational software product contains an 
unidentifiable number of errors. Although these errors may be 
due to coding, formulation, or design, the central problem 
facing the development group is the identification and 
correction of as many of these errors as possible prior to 
release. Clearly the fewer the errors remaining in the 
product, the higher the quality. In this environment a 
"software failure" is hardly applicable. The software never 
fails to operate - it always operates, either correctly or 
incorrectly. Correct operation, however, is not always formally
specified and generally includes implied requirements. 

It is apparent that given identical specifications, two 
different software development groups will each produce a 
product of different quality. Similarly, differing 
specifications for the same product submitted to a single 
development group will result in products of different 
quality. The functions of specification and implementation 
each directly contribute to software quality. 

The central issue facing a software development group is
that of determining when the product is "ready" for operational 
release. Established practices include extensive laboratory 
testing in both static and dynamically controlled environments, 
testing by an independent software group, flight testing, and 
acceptance testing by the eventual user. A typical procedure 
used for identifying and correcting errors consists of trouble 
reporting. Each report is fully investigated by a software 
systems engineer and corrective action, if any, is 
recommended. At this point the report and recommended action 
are reviewed and an action decision is made. Any code 
corrections are entered into the next modification and testing 
continues.

This system provides tight control over the identification 
and correction of software errors. Also, it provides formal 
documentation on detected software errors. Generally, however,
the accumulated data resulting from this procedure does not 
include CPU time or the number of times the software has been 
executed in a given time period. Hence, it is difficult to 
quantitatively establish software quality or reliability at the 
time the code is released for operational service. 

Such quantitative measures could be readily used by the 
development group, the program management group, and the 
operations group to aid in planning, costing, and decision 
making tasks. These applications should be pursued for their 
inherent value; however, quantitative measures have yet another 
application. Quantitative measures of software quality or 
reliability can assist the systems designer in analyzing 
software/hardware interactions and in quantitatively specifying 
the overall system performance. 

Previous studies (References 1 and 2) have assessed the 
various merits and properties of several reliability measures. 

The failure ratio, U, is defined as

U = F/N 

where F is the number of failures or software errors observed 
in N runs in a given calendar period, usually one month. The 
failure rate, FR, is defined as 

FR = f/t 

where f is the number of software errors observed during the 
total CPU time accumulated over a given calendar period. 
Additionally, the indicator MTBF is defined as 

MTBF = 1/FR 

which is an important quantity because it is analogous to 
commonly used hardware reliability expressions. 
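
The three definitions above can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the sample counts (3 failures, 120 runs, 60 CPU hours) are assumptions chosen for the example and are not data from this study.

```python
def failure_ratio(failures, runs):
    """U = F/N: failures observed per run in a calendar period."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    return failures / runs


def failure_rate(failures, cpu_hours):
    """FR = f/t: failures observed per CPU hour in a calendar period."""
    if cpu_hours <= 0:
        raise ValueError("cpu_hours must be positive")
    return failures / cpu_hours


def mtbf(failures, cpu_hours):
    """MTBF = 1/FR = t/f; returned as None when no failures were observed."""
    if failures == 0:
        return None
    return cpu_hours / failures


# Hypothetical month: 3 failures in 120 runs over 60 CPU hours.
U = failure_ratio(3, 120)
FR = failure_rate(3, 60.0)
MTBF = mtbf(3, 60.0)
```

Returning None for a period with no failures is a design choice of this sketch; it avoids reporting an infinite MTBF for an error-free interval.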

The principal indicator derived and analyzed in this
study was the failure rate, with MTBF presented for comparative
and illustrative purposes. The form in which the data was
available in effect dictated the use of failure rate. During
avionic software development, CPU time has been demonstrated to
be more easily collectable than the number of initial program
loads (IPL), regardless of the type of testing, i.e., flight
testing or dynamic simulation testing. Additionally, flight 
time and ground time are traditionally well maintained 
statistics for aircraft, and hence for their avionics systems. 

III. DATA RESOURCES


The operational environment selected for this study was an 
embedded avionics computer system. An attempt was made to 
obtain error data from several different avionics systems. The 
two contractually designated data sources were the A-7 Avionics
Development Program, Naval Weapons Center, China Lake,
California, and the F-111 Operational Flight Program,
Sacramento Air Logistics Center at McClellan Air Force Base,
Sacramento, California. In addition, the Aerospace Corporation
identified four other potential data sources: the F-14
avionics computer program office at the Pacific Missile Test
Center, Point Mugu, California; the Naval Air Development
Center, Warminster, Pennsylvania; the Air Force Avionics
Laboratory, Wright-Patterson Air Force Base, Dayton, Ohio; and
the Jet Propulsion Laboratory, Pasadena, California. A
preliminary investigation was conducted at these facilities to
determine the availability of data suitable for a reliability
measurement study.

Naval Weapons Center (NWC) 

Software error data from the embedded avionics computer 
system on the A-7 aircraft was available for this study from 
the Naval Weapons Center. The data came from two major 
software releases. The NWC-2 software package provided actual 
operational flight time data and the NWC-3 release provided 
data from the final test and evaluation phase of the software 
system, including more than 5000 hours of flight time and 
simulation time. 

During the operational lifetime of a software release, 
errors are isolated by a full investigation of all trouble 
reports. There is a formal mechanism for reporting system 
computer errors; however, many reports are verbal and complete 
documentation for all reported errors does not exist. When it 
was available, information on the frequency of error
occurrences existed only in narrative form. After an actual 
software error is catalogued, an operational fix is generated 
and the code is corrected in the next release. The time and 
effort required to investigate each report was not documented, 
but it was estimated that a fix takes about one month. 

Before the NWC-2 avionics software package was released in 
May, 1977 for operational use, thorough testing and evaluation
phases were performed to detect many of the errors in the 
software. An additional period of verification and validation 
was used so that by the end of the examinations the software 
was essentially error free. The data was collected from 
various aircraft flights over a period of thirteen months from 
May, 1977 through May, 1978. It was acquired during a visit to 
the Naval Weapons Center on January 22, 1979. 

Three errors were detected following the release of the 
NWC-2 software. One was detected by the fleet almost 
immediately after the release, and two by the Naval Weapons 
Center early in the release. Although the exact time of the 
error detections is not known, the data included the total 
number of flight hours for each month, the number of errors 
detected, the type of error (fatal, critical, or non-critical),
and the month the error was corrected. The monthly flight 
hours and the number of planes in the fleet are classified 
information; however, the total monthly flight hours are 
unclassified and have been used in the data analysis. 

The NWC-3 was released during the test and evaluation phase 
in January, 1979. NWC collected error data during the 
verification and validation phase which began at the end of 
February and continued until the end of April. During this 
period of acquisition, nine codes were released, denoted here
6B, 6D, 6E, 6F, 7A, 8A, 8B, 8C, and 8D. The period of operation
for each of the codes is given in Table 1. Note that the codes
7A and 8A, and codes 8B and 8C have overlapping flight dates as
shown in the time table. The data came from three sources:
actual test flights, the weapons laboratory and the simulation
laboratory. This data included CPU time, number of errors
encountered, and detailed trouble reports about the types of
errors found, with a brief description of each. Naval Weapons
Center personnel maintained time logs during test flights and
dynamic simulation tests, and it was thought that it would be
possible to correlate computer time and errors from this data.


                    TABLE 1

            NWC-3 CODE RELEASE DATES

        Code        Operational Dates

        6B          15 Aug 78 - 26 Sep 78
        6D          13 Sep 78 - 23 Oct 78
        6E          28 Sep 78 - 20 Nov 78
        7A          29 Nov 78 -  9 Jan 79
        8A           3 Jan 79 - 24 Jan 79
        8B          24 Jan 79 - 16 Mar 79
        8C           8 Mar 79
        8D          14 Mar 79 - 27 Mar 79

(Time table: flight time, weapons lab, and simulation lab periods
plotted by month, J A S O N D 1978 and J F M A M J J A 1979.)


Sacramento Air Logistics Center 

The F-111 avionics system consists of three computers: the
Guidance and Navigation Computer (GNC) for general navigation
tasks, the Weapons Defense Computer (WDC) for weapons
delivery tasks, and the Navigation Computer Unit (NCU) for
navigation and control tasks. The original software was
developed by the General Dynamics Corporation. The F-111
software section at McClellan is responsible for the
development, integration, and test and evaluation of the
software.

The F-111 software management is based on an 18-month
life cycle. User requirements and changes in mission
requirements that affect the overall system and cost are
thoroughly reviewed, resulting in a coordinated block of
modifications. These undergo full-scale static and dynamic
testing during the development phase and are then installed and
implemented in the system. There is an independent test and
evaluation (IT&E) of the code, which is usually performed by a
contractor test team. The IT&E phase is conducted, for the
most part, on a dynamic simulator. The trouble reports
generated during the IT&E phase are the source of the software
error records. The software section also maintains accounting
records from which mean time to repair (MTTR) may be
calculated. The next phase is engineering flight testing, which
is performed on an instrumented aircraft. Finally, user flight
testing is conducted on non-instrumented aircraft.

Although this facility was designated as a good data source 
prior to study execution, further investigation revealed that 
no logs were kept of flight time or CPU time. Because it was 
determined that several man-months would have been required in
order to extract any relevant data, it was decided not to
pursue this source further. 

Pacific Missile Test Center (PMTC)

The Pacific Missile Test Center's formal system for 
cataloging all avionics computer system errors is called the 
Airborne Weapons Corrective Action Program (AWCAP). Software 
errors are found by a full investigation of the trouble reports 
collected during the development and operational lifetime of 
the software release. Each reported problem is entered into 
the computer data base and updated whenever more information is 
received. The trouble report contains the following items: 

System component identification
Problem brief
Occurrences (including date and source of the report)
Problem description
Configuration
Corrective action
Action summary
References

The reliability data are embedded in the problem 
description, the corrective action taken, and the action 
summary, all of which are in narrative form. AWCAP provides 
considerable sorting and reporting capabilities; however, this 
system provided no established procedure for flagging timing
information, so it was decided that this data source could not
be utilized.

Naval Air Development Center (NADC)


The NADC software development cycle is similar to that of 
the A-7 office at NWC and to that of the F-14 office at the
Pacific Missile Test Center. Computer logs were kept during 
the test and evaluation phases that could be used to calculate 
the run time of the software over each day. From these it 
might have been possible to correlate data from the trouble 
reports. It was decided that no data would be collected from 
the Naval Air Development Center because the P-3 software data 
was quite similar in form and type to that of the Naval Weapons 
Center, and personnel at this facility maintained that 
approximately three man-months would have been necessary in 
order to extract the required data. 

Air Force Avionics Laboratory (AFAL) 

AFAL was chosen to perform the independent verification and 
validation testing on the F-16 avionics software developed by 
the General Dynamics Corporation. Testing had been completed 
on six versions of the flight tape by January, 1979. AFAL was 
using the production tape to train fleet pilots in the Pilot 
Training Operation (PTO). A total of 48 hours of flight time 
was logged on two separate dates by various training pilots, 17 
hours in January, 1979 and 31 hours in February, 1979. They 
collected software reliability data during this time, including 
CPU time, and number and type of errors. 

Delivery of the data was made by Major John Weber of AFAL 
in March, 1979. It consisted of two months of data which were 
very similar to the operational data from the NWC-2 software.
Although an agreement was made to supply as much data as needed 
for statistical analysis, the final shipment of data was never 
received, and the data supplied was too limited to be 
statistically significant. 

Jet Propulsion Laboratory (JPL)

The Development Section of JPL's Mission Control Computer 
Center (MCCC) has for three years collected software error
data from the Voyager ground-based real-time software computer
system. It was thought that this large data base could be used
to validate the statistical approach to the A-7 and F-16
analysis, even though it came from a ground-based system rather
than an airborne system.

During the three years the Voyager has flown, there have
been only two errors involving the on-board computer. One was 
a memory hardware failure and the other was the transmittal and 
loading of an incorrect set of commands. In contrast, the 
software on the real-time ground based computer system has had 
between two and three thousand reported failures during the 
last three years. The records of these errors included the 
time of error occurrence and level of severity. This data was 
provided on a weekly basis from the 48th week of 1978 through 
the 20th week of 1979, covering a total of 9756.138 hours of 
operation. 




IV. DATA ANALYSIS 


The data base consisted of two sets of data: (1) from the
Naval Weapons Center, error data from the A-7 avionics software
package, consisting of the NWC-2 utilization code and the NWC-3
release, and (2) from the Jet Propulsion Laboratory, error data
from the Voyager's ground-based computer system. The principal
statistical analyses performed on this data were the
calculation of the failure rate (FR), the number of failures
divided by the CPU time accumulated over a given calendar
period, and the mean time between failures (MTBF). The data
were evaluated by simple linear regression of the failure rate
on the successive codes and/or the release date.
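
As a hedged sketch of such a regression, the Python fragment below fits failure rate against release order and reports the correlation coefficient. The failure rates are the actual-flight values listed by code in Table 4; coding the releases as the integers 1 through 9 is an assumption made here for illustration, and the function name `linear_fit` is hypothetical.

```python
import math


def linear_fit(x, y):
    """Least-squares slope, intercept, and correlation coefficient r of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Assumed coding: release order 1..9 for 6B, 6D, 6E, 6F, 7A, 8A, 8B, 8C, 8D.
# Code 8A (position 6) has no actual-flight entry in Table 4 and is omitted.
release_order = [1, 2, 3, 4, 5, 7, 8, 9]
flight_fr = [0.25, 0.375, 0.1429, 0.12, 0.125, 0.0174, 0.025, 0.0252]

slope, intercept, r = linear_fit(release_order, flight_fr)
print(f"slope = {slope:.4f} per release, r = {r:.3f}")
```

A negative slope with r near -1 corresponds to failure rate falling steadily from one release to the next; the same function can be applied with calendar dates in place of release order.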

Other statistics were investigated, including the
cumulative mean time between failures (total CPU time
divided by the cumulative number of errors) and an
exponential model for relating failure rates from one calendar
period to the next, but these statistics did not yield any
significant findings. Because the data contained no information
on the number of runs, the failure ratio (number of
failures per calendar interval divided by the total number of
runs) could not be calculated.
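
The cumulative MTBF described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the study: the function name `cumulative_mtbf` is hypothetical, and the inputs are the monthly actual-flight hours and failure counts as listed in Table 5.

```python
from itertools import accumulate


def cumulative_mtbf(cpu_hours, errors):
    """Cumulative MTBF per period: total CPU hours to date / total errors to date."""
    total_hours = accumulate(cpu_hours)
    total_errors = accumulate(errors)
    # None marks periods before the first error, where the ratio is undefined.
    return [h / e if e > 0 else None for h, e in zip(total_hours, total_errors)]


# Monthly actual-flight figures as listed in Table 5 (7/78 through 3/79).
flight_hours = [4.0, 29.0, 25.0, 9.0, 15.0, 189.5, 114.5]
flight_errors = [1, 6, 3, 2, 1, 3, 3]

series = cumulative_mtbf(flight_hours, flight_errors)
for month, value in zip(["7/78", "9/78", "10/78", "11/78", "12/78", "1/79", "3/79"], series):
    print(month, None if value is None else round(value, 2))
```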

Naval Weapons Center (NWC) 

Error severity was established for all the errors which 
were reported to have occurred during a software release. 
Severity ranged from fatal to non-critical. A fatal error
caused the system to fail completely; a critical error 
indicated that one part of the system failed, but the system 
continued to function with, perhaps, the wrong information; and 
a non-critical error was an annoyance type, such as a misnamed 
variable or a pilot preference for certain mechanisms or ways. 

Table 2 shows the NWC-2 error data ordered by flight date.
Fatal, critical and non-critical errors were assigned a 1, 2 or
3, respectively.


                          TABLE 2

                    NWC-2 FLIGHT DATES

   Flight Date    No. of Flight Hours    No. of Errors    Type of Error

   May 77               11829                  1                3
   June 77              13338                  1                3
   July 77              11536                  1                3
   Aug 77               13697                  0
   Sep 77               12639                  0
   Oct 77               12353                  0
   Nov 77               12393                  0
   Dec 77               10485                  0
   Jan 78               11129                  0
   Feb 78               12663                  0
   Mar 78               14586                  0
   Apr 78               12329                  0
   May 78                9613                  0


Only three non-critical errors occurred in this software. Since
the exact time of the occurrences is not known, it is assumed
that one error occurred each month for the first three months.
Figure 1 is a plot of the failure rates for these three months.
Since the correlation coefficient is zero, nothing can be
concluded about the relationship of failure rate and software
reliability from this data. In fact, this is really an ideal
case, because the errors detected were correctly fixed and the
software has been operating successfully ever since. The
remarkable MTBF of 4062 hours and FR of 0.0002 can partly be
explained by the fact that the diversity of aircraft on which
this software was run tended to isolate errors not captured by
the final test and evaluation phase of the development cycle.

The NWC-3 data required some organization before analysis 
could be initiated. The 87 discrepancy reports were divided 
into three groups depending on whether they were from actual 
test flights, the weapons laboratory, or from the simulation 
laboratory. A total of 62 errors occurred during the period of 
5753 hours of data made available. Table 3 shows the breakdown 
of error types for the three data sets. Note that non-critical 
errors occurred most frequently, followed by critical and 
finally by fatal errors. 

Table 3 also shows that MTBF is lowest for actual flight 
time and highest for the simulation laboratory. When arranged
by code, Table 4 shows that the software was more reliable as
each code was released. Note that codes run on all three
systems again show the highest reliability at the simulation
laboratory. This probably indicates that simulation tests do 
not detect as many errors as actually flying the software, and 
that failure rate for all NWC-3 data combined is not a 
meaningful statistic. 


                       TABLE 3

                  NWC-3 ERROR TYPES

   Type of Error    No. of Errors      MTBF         FR

   ACTUAL FLIGHT TIME

         1                1            388.0      0.0026
         2                3            129.33     0.0077
         3               15             25.87     0.0387
       Total             19             20.42     0.0490

   WEAPONS LABORATORY

         2                7            172.43     0.0058
         3               26             46.42     0.0215
       Total             33             36.58     0.0303

   SIMULATION LABORATORY

         2                2           2080.0      0.0005
         3                8            520.0      0.0019
       Total             10            410.0      0.0002


                       TABLE 4

               NWC-3 ERROR DATA BY CODE

                                        FAILURE RATE
   Code      FR         MTBF       Type 1    Type 2    Type 3

   ACTUAL FLIGHT TIME

   6B      0.25          4.0                           0.25
   6D      0.375         2.667               0.125     0.25
   6E      0.1429        7.0                 0.0476    0.0952
   6F      0.12          8.333                         0.12
   7A      0.125         8.0                 0.0416    0.0833
   8A
   8B      0.0174       57.47      0.0058              0.0116
   8C      0.025        40.0                           0.025
   8D      0.0252       39.68                          0.0252

   WEAPONS LABORATORY

   6B      0.04         25.0                 0.0081    0.0324
   6D      0.025        40.0                 0.003     0.0240
   6F      0.083        12.005                         0.0833
   7A      0.02         50.0                 0.0042    0.0167
   8A      0.041        24.39                0.0167    0.0250
   8B      0.018        55.55                0.0042    0.0042

   SIMULATION LABORATORY

   6B      0.0031      322.58                0.0016    0.0016
   6D      0.0063      158.73                          0.0063
   6E      0.0031      322.58                0.0014    0.0014
   7A      0.0028      357.14                          0.0028
   8A      0.0007     1428.57                          0.0007


Table 5 is a summary of the data by month. For example,
one software failure was detected during 7/78, zero during
8/78, six during 9/78, etc. In addition, it can be seen that
the number of errors decreases in each successive release and
that those that occurred were non-critical. Figures 2-4 are
plots of the failure rate for each code run on the three 
systems. The correlation coefficient for the actual flight 
time data shows a strong negative correlation of failure rate 
with time. Any correlation between failure rate and time is 
negligible for the weapons laboratory. The simulation 
laboratory shows the highest correlation, although as mentioned 
earlier, it is not clear that this is valid data for predicting 
software reliability. The same conclusions emerge when failure 
rate is plotted by month, as seen in Figures 5-7. No 
additional information was yielded when failure rate was 
plotted for each error type separately. 
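
The contrast between the three NWC-3 test environments, and the reason a single pooled failure rate is hard to interpret, can be checked with a short calculation. The sketch below is illustrative only: error totals are those given in Table 3, the hours are the monthly figures in Table 5 summed for each environment (so the per-environment values may differ slightly from those printed in Table 3), and the names used are hypothetical.

```python
# Error totals from Table 3; hours are the monthly times in Table 5, summed.
environments = {
    "actual flight": {"errors": 19, "hours": 386.0},
    "weapons laboratory": {"errors": 33, "hours": 1207.0},
    "simulation laboratory": {"errors": 10, "hours": 4160.0},
}


def rate_and_mtbf(errors, hours):
    """Return (FR, MTBF) = (errors / hours, hours / errors)."""
    return errors / hours, hours / errors


per_env = {
    name: rate_and_mtbf(d["errors"], d["hours"]) for name, d in environments.items()
}

# Pooling all three environments weights the result toward simulation hours.
total_errors = sum(d["errors"] for d in environments.values())
total_hours = sum(d["hours"] for d in environments.values())
pooled_fr, pooled_mtbf = rate_and_mtbf(total_errors, total_hours)

for name, (fr, mtbf) in per_env.items():
    print(f"{name:22s} FR = {fr:.4f}  MTBF = {mtbf:.2f} h")
print(f"{'pooled':22s} FR = {pooled_fr:.4f}  MTBF = {pooled_mtbf:.2f} h")
```

Because most of the accumulated hours come from the simulation laboratory, the pooled MTBF falls well above the actual-flight value, which is one way to see why the combined figure was not treated as a meaningful statistic.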

Jet Propulsion Laboratory (JPL) 

The best error information JPL could provide was the total 
number of each type of error and the type of fix taken. These 
are listed in Table 6. Since the time of occurrence of most of 
the 154 errors was not transmitted, very little analysis was 
possible. Failure rate was determined for all errors occurring 
during a given week and plotted on Figure 8. The MTBF was 
calculated to be 63.35 hours and the FR 0.0158. Although 
Figure 8 shows a slight positive correlation between failure 
rate and time, more information on the type of errors that 
occurred would have been required in order to draw conclusions 
regarding the quality of the software. 


                                   TABLE 5

                           NWC-3 ERROR DATA SUMMARY

            No. of     Flight      No. of Errors                         FAILURE RATE
           Software     Time           Type                                  Type
   Date    Failures    (Hours)      1    2    3      FR        MTBF      1       2       3

   ACTUAL FLIGHT TIME

   7/78        1          4.0                 1    0.25        4.0                     0.25
   9/78        6         29.0            2    4    0.2069      4.833           0.069   0.1379
   10/78       3         25.0                 3    0.12        8.333                   0.12
   11/78       2          9.0            1    1    0.222       4.5             0.111   0.111
   12/78       1         15.0                 1    0.0667     15.0                     0.0667
   1/79        3        189.5       1         2    0.0158     63.167   0.005           0.0102
   3/79        3        114.5                 3    0.0262     38.167                   0.0262

   WEAPONS LABORATORY

   7/78       10        247.0            2    8    0.0405     24.7             0.008   0.0324
   9/78        9        336.0            1    8    0.0268     37.3             0.003   0.0283
   10/78       2         24.0                 2    0.0833     12.0                     0.0833
   11/78       5        240.0            1    4    0.0208     48.0             0.004   0.0167
   1/79        7        360.0            3    4    0.0194     51.429           0.008   0.0111

   SIMULATION LABORATORY

   7/78        2        320.0            1    1    0.0063    160.0             0.003   0.0031
   8/78        1        320.0                 1    0.0031    320.0                     0.0031
   9/78        2        640.0                 2    0.0031    360.0                     0.0031
   10/78       2        720.0            1    1    0.0031    360.0             0.001   0.0014
   12/78       2        720.0                 2    0.0031    360.0                     0.0031
   1/79        1       1440.0                 1    0.0007   1440.0                     0.0007


                    TABLE 6

            JPL ERROR CLASSIFICATION

   Type Description          No. of Errors

   Not a problem                  10
   Not worth fixing               15
   Source code fix                48
   Critical-make patch            34
   Under investigation             6


V. CONCLUSIONS


Failure rate appears to be a useful statistic for 
estimating software quality and recognizing trends in the 
reliability of operational avionics software. Although the 
NWC-2 data is summary type data for a large sample of aircraft, 
and thorough reliability measurement analysis requires data by 
individual aircraft, a figure of merit may be associated with 
the software at the time of its release to operational units 
since the failure rate is decreasing with increasing 
development time. While the Naval Weapons Center provided 
excellent statistical data during the final test and evaluation
phase of the NWC-3 code, diagnostic efforts continued 
throughout the operational acceptance testing and use, so that 
a true operational figure of merit would not be available until 
several months after the operational release. Because failure 
rate decreases with each successive code release, there is an 
implication that the code is continually maturing. Preliminary 
results on the ground-based computer system for the Voyager
tend to indicate somewhat inferior software quality, yet
the system is highly functional. The degree of software
quality a system needs is a question that must be answered in
the future.

The data available for this study was clearly insufficient 
for any detailed reliability measurement analysis. Collection 
of data would ideally come from an operational environment 
where continuous maintenance and service records were kept. 

Such an opportunity may well exist within the military, 
provided an agreeable data collection and transmission protocol 
can be established, and security conflicts resolved. 